XML-OLAP: A Multidimensional Analysis Framework for XML Warehouses
Abstract
A large number of XML documents have recently become available on the Internet. This trend has motivated many researchers to analyze them multidimensionally, in the same way as relational data. In this paper, we propose a new framework for multidimensional analysis of XML documents, which we call XML-OLAP. XML-OLAP is based on XML warehouses in which all fact data and dimension data are stored as XML documents. We build XML cubes from these XML warehouses. We also propose a new multidimensional expression language for XML cubes, which we call XML-MDX. XML-MDX statements target XML cubes and use XQuery expressions to designate the measure data. They also specify text mining operators for aggregating the text that constitutes the measure data. We evaluate XML-OLAP by applying it to a U.S. patent XML warehouse. The XML-MDX queries we ran demonstrate that XML-OLAP is effective for multidimensional analysis of U.S. patents.
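The idea of a cube whose measure is text, selected by a path expression and aggregated by a text mining operator, can be illustrated with a minimal sketch. This is not the XML-MDX language or the authors' implementation: the element names (`patent`, `year`, `class`, `abstract`), the sample data, and the `top_keywords` operator are all hypothetical, and Python's limited ElementTree path syntax stands in for XQuery.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical fact documents; element names and contents are illustrative only.
FACTS_XML = """
<patents>
  <patent><year>2001</year><class>G06F</class>
    <abstract>XML query processing for XML databases</abstract></patent>
  <patent><year>2001</year><class>G06F</class>
    <abstract>Index structures for XML query evaluation</abstract></patent>
  <patent><year>2002</year><class>H04L</class>
    <abstract>Network routing protocol for packet switching</abstract></patent>
</patents>
""".strip()

STOPWORDS = {"for", "the", "of", "a", "and"}


def top_keywords(texts, k=2):
    """Assumed text mining aggregate: the k most frequent non-stopword terms."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    # Sort by descending frequency, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


def build_cube(root, fact_path, dim_paths, measure_path, aggregate):
    """Group fact elements by their dimension values and aggregate the text measure."""
    cells = defaultdict(list)
    for fact in root.findall(fact_path):
        key = tuple(fact.findtext(p, default="") for p in dim_paths)
        cells[key].append(fact.findtext(measure_path, default=""))
    return {key: aggregate(texts) for key, texts in cells.items()}


cube = build_cube(
    ET.fromstring(FACTS_XML),
    fact_path="patent",
    dim_paths=["year", "class"],
    measure_path="abstract",
    aggregate=top_keywords,
)
for cell, keywords in sorted(cube.items()):
    print(cell, keywords)
```

Each cube cell here is keyed by a (year, class) pair and holds the keywords summarizing the abstracts that fall into it, which is the text analogue of a numeric aggregate such as a sum.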
Similar resources
XML Multidimensional Modelling and Querying
As XML becomes ubiquitous and XML storage and processing becomes more efficient, the range of use cases for these technologies widens daily. One promising area is the integration of XML and data warehouses, where an XML-native database stores multidimensional data and processes OLAP queries written in the XQuery interrogation language. This paper explores issues arising in the implementation of...
Integrating Data Warehouses with Web Data for Olap Using Semantic Data Clustering Techniques
Nowadays, information retrieval plays an important role on the web. Much research has presented techniques for information retrieval from databases. Previous work presented an extended tree pattern clustering process for massive XML storage. This paper presents a new technique, termed semantic data clustering (SDC), for combining the data warehouse and web data for OLAP by retri...
Meta Cube-X: An XML Metadata Foundation for Interoperability Search among Web Data Warehouses
OLAP (Online Analysis Processing) applications have very special requirements to the underlying multidimensional data that differs significantly from other areas of application (e.g. the existence of highly structured dimensions). In addition, providing access and search among multiple, heterogeneous, distributed and autonomous data warehouses, especially web warehouses, has become one of the l...
Fragmenting very large XML data warehouses via K-means clustering algorithm
XML data sources are more and more gaining popularity in the context of a wide family of Business Intelligence (BI) and On-Line Analytical Processing (OLAP) applications, due to the amenities of XML in representing and managing semi-structured and complex multidimensional data. As a consequence, many XML data warehouse models have been proposed during past years in order to handle heterogeneity...
Multidimensional Analysis of XML Document Contents with OLAP Dimensions
With the emergence of Semi-structured data format (such as XML), the storage of documents in centralised facilities appeared as a natural adaptation of data warehousing technology. Nowadays, OLAP (On-Line Analytical Processing) systems face growing non-numeric data. This chapter presents a framework for the multidimensional analysis of textual data in an OLAP sense. Document structure, metadata...